Hardware Supports for Efficient Barrier Synchronization on 2-D Mesh Networks
Abstract
In this paper, we consider a hardware scheme for supporting barrier synchronization on scalable systems with a 2-D mesh network. Our design takes into account the program execution path in such systems, from programming interfaces down to routers. The hardware router design is based on the MPI-1 standard. A distributed algorithm is proposed to construct a collective synchronization tree (CS tree) from the nodes participating in the barrier. Based upon the CS tree, the status registers in the routers are set up, and synchronization messages are transmitted along the paths set by the status registers. Performance evaluations show that our proposed method has better performance for barrier synchronization and is less sensitive to variations in group size and startup delay than previous approaches. However, our scheme has the extra overhead of building the CS tree. Thus it is more suitable for parallel iterative computations, in which the same barrier is invoked repetitively.
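The trade-off described in the abstract (a one-time tree setup cost amortized over repeated barriers) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design or its distributed CS-tree algorithm: the tree construction in `build_tree` (a binary-heap layout over the sorted participants), the hop-count cost model, and all function names are hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Node = Tuple[int, int]  # (x, y) coordinate of a node on the 2-D mesh


def hops(a: Node, b: Node) -> int:
    """Manhattan distance: the hop count under minimal mesh routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_tree(participants: List[Node]) -> Dict[Node, Optional[Node]]:
    """Hypothetical tree construction (NOT the paper's distributed algorithm):
    sort the participants and link them in a binary-heap layout.
    Returns a child -> parent map; the root maps to None."""
    order = sorted(participants)
    parent: Dict[Node, Optional[Node]] = {order[0]: None}
    for i in range(1, len(order)):
        parent[order[i]] = order[(i - 1) // 2]
    return parent


def depth_in_hops(parent: Dict[Node, Optional[Node]], node: Node) -> int:
    """Total mesh hops from a node up to the root along tree edges."""
    total = 0
    while parent[node] is not None:
        total += hops(node, parent[node])
        node = parent[node]
    return total


def barrier_hops(parent: Dict[Node, Optional[Node]]) -> int:
    """Gather phase (deepest node to root) plus release phase (root back down),
    assuming hop count is the only cost."""
    longest = max(depth_in_hops(parent, n) for n in parent)
    return 2 * longest


def amortized_cost(setup: float, per_barrier: float, iterations: int) -> float:
    """Average cost per barrier when the tree is built once and reused."""
    return setup / iterations + per_barrier
```

For example, with participants `[(0, 0), (0, 3), (2, 1), (3, 3)]` this model gives a barrier of 12 hops. With an assumed setup cost of 100, the average cost per barrier falls from 112 for a single invocation to 13 over 100 invocations, which is the amortization argument the abstract makes for iterative computations.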
Similar resources
Turn Grouping for Efficient Barrier Synchronization in Wormhole Mesh Networks
Barrier is an important synchronization operation. On scalable parallel computers, it is often implemented as a collective communication with a reduction operation followed by a distribution operation. In this paper, we introduce a systematic way of generating efficient algorithms to perform barrier synchronization in mesh networks. The scheme works with any base routing algorithm derivable fro...
An efficient routing methodology to tolerate static and dynamic faults in 2-D mesh networks-on-chip
The move towards nanoscale Integrated Circuits (ICs) increases performance and capacity, but poses process variation and reliability challenges which may cause several faults on routers in Networks-on-Chips (NoCs). While utilizing healthy routers in an NoC is desirable, faulty regions with different shapes are formed gathering faulty routers. Fault regions can be used to lead the fault-tolerant...
A Hybrid Time Synchronization Implemented Through Special Ring Array for Mesh or Torus
In this paper, we present a new efficient hybrid time synchronization scheme for mesh or torus interconnection networks, called ROCTS. ROCTS comprises two levels: a lower level implemented over a special high-speed ring array, and a level for the mesh or torus network. In ROCTS, the second network we construct differs from past designs: it is a ring array with each ring not conn...
Cooperative communication based barrier synchronization in on-chip mesh architectures
We propose cooperative communication as a means to enable efficient and scalable barrier synchronization on mesh-based manycore architectures. Our approach is different from but orthogonal to conventional algorithm-based optimizations. It relies on collaborating routers to provide efficient gather and multicast communication. In conjunction with a master-slave algorithm, it exploits the mesh re...
Efficient Implementation of Barrier Synchronization in Wormhole-Routed Hypercube Multicomputers
This paper addresses efficient implementation of barrier synchronization in wormhole-routed hypercube multicomputers. For those systems supporting only unicast communication in hardware, a novel software tree approach, the U-cube tree, is proposed. An important feature of the U-cube tree is that all messages injected into the network are guaranteed to be contention-free. Performance measurements ...